Abstract. - The Mike-Farmer (MF) model was constructed empirically based on the continuous double auction
mechanism in an order-driven market, which can successfully reproduce the cubic law of returns and
the diffusive behavior of stock prices at the transaction level. However, the volatility (defined by absolute 
return) in the MF model does not show sound long memory. We propose a modified version of the MF model 
by including a new ingredient, that is, long memory in the aggressiveness (quantified by the relative prices) 
of incoming orders, which is an important stylized fact identified by analyzing the order flows of 23 liquid 
Chinese stocks. Long memory emerges in the volatility synthesized from the modified MF model with the 
DFA scaling exponent close to 0.76, and the cubic law of returns and the diffusive behavior of prices are also 
produced at the same time. We also find that the long memory of order signs has no impact on the long mem- 
ory property of volatility, and the memory effect of order aggressiveness has little impact on the diffusiveness 
of stock prices. 
Introduction. - The continuous double auction mechanism
is adopted in the electronic trading systems in many stock
markets worldwide. In particular, most emerging stock mar- 
kets are order-driven markets. In a pure order-driven market, 
there are no market makers or specialists, and market partici- 
pants submit and cancel orders, which may result in transac- 
tions based on price-time priority. Different from quote-driven 
markets where market makers are liquidity providers, the same 
trader in an order-driven market can act as either a liquidity 
taker or a liquidity provider depending on the aggressiveness of 
her submitted orders. The behavior of market makers is very
complicated, since they are obliged to maintain the liquidity
of stocks while also wanting to maximize their
profits. It is thus natural to argue that it is easier to construct
microscopic models for order-driven markets than for quote-driven
markets in order to understand the macroscopic regularities
of stock markets from a microscopic point of view.

Indeed, considerable effort has been devoted to constructing
order-driven models [1], which date back to the 1960s [2].
In order to check whether the model captures some basic aspects of
the underlying mechanisms governing the evolution of stock 
prices, one usually investigates the statistical properties of the 
mock stocks, such as the distribution and autocorrelation of re- 
turns and the long memory in volatility. Deviations from these 
well-established stylized facts allow us to improve the models 
and gain a better understanding of the underlying microscopic 
mechanisms. For instance, the DFA scaling exponent of price 
fluctuations is found to be significantly less than the empirical 
value in the Bak-Paczuski-Shubik model [3] and in the Maslov 
model [4], leading to new order-driven models [5-8]. 

Recently, Mike and Farmer proposed an empirical behavioral
model, which is based on the statistical properties
of order placement and cancelation extracted from
ultrahigh-frequency stock data [9]. To the best of our knowledge, the
Mike-Farmer model (or MF model for short) is the only empirical
model; it outperforms other order-driven models
and can be adapted for further improvement. The MF model can
reproduce several important stylized facts: the returns are
distributed according to the cubic law, the DFA scaling exponent
of returns is close to 0.5, and the spreads and lifetimes of
orders have power-law tails. However, the DFA scaling exponent
of the volatility is found to be $H_v \approx 0.6$, which is much
less than the empirical value of $H_v \approx 0.8$ [9]. In this work, we
propose a modified version of the MF model, which is able to
produce realistically strong persistence in the volatility
without destroying the other stylized facts.

The volatility clustering phenomenon, as well as other im- 
portant stylized facts, can be observed in many other micro- 
scopic market models. In the econophysics literature, physi- 
cists model stock markets as a complex system with interacting 
agents and different physics scenarios lead to different types of 
models [10], such as percolation models [11-16], spin mod- 
els [17-22], minority games [23-30], majority games [31-33], 
and the $-game [34], to list a few. There is also a long list 
of stock market models in the economics literature [35]. In 
contrast with models where the agents (or traders) are
homogeneous, most economic models assume that the traders
have bounded rationality and heterogeneous beliefs [36,37].
Traders can thus be classified into two types: fundamentalists
and chartists. The fundamentalists believe that the asset price is
solely determined by economic fundamentals and they buy (or
sell) when the price is lower (or higher) than the fundamental
price. Chartists, on the other hand, are trend followers and try to
predict future price movements using diverse techniques.
Many theoretical and computational models have been
proposed [38-46].

Mike-Farmer model and its modification. - The MF model
contains two main parts, order placement and cancelation.
In order to submit an order, one needs to decide its
direction (buy or sell), price and size. In the MF model, the
size of any order is fixed to one. The signs of orders exhibit
strong long memory, with $H_s \approx 0.8$ [47]. Therefore, order
signs can be generated from fractional Brownian motions with
DFA scaling exponent $H_s$. The price of an incoming order can
be characterized by the relative price $x$, which is the logarithmic
distance of the order price from the same-side best price:



$$
x(t) = \begin{cases}
\ln \pi(t) - \ln \pi_b(t-1), & \text{buy orders} \\
\ln \pi_a(t-1) - \ln \pi(t), & \text{sell orders}
\end{cases}
\qquad (1)
$$

where $\pi(t)$ is the order price at time $t$, and $\pi_b(t-1)$ and
$\pi_a(t-1)$ are the best bid and best ask at time $t-1$, respectively.
The relative prices in the MF model are generated from a
Student distribution whose degrees of freedom $\alpha_x$ and scaling
parameter $\sigma_x$ are determined empirically using real stock data.
Mike and Farmer also proposed a model for order cancelation 
combining three factors: the position of an order in the order 
book, the imbalance of buy and sell orders in the book, and the 
total number of orders in the book. 
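
Eq. (1) can be made concrete with a minimal sketch. The function name `relative_price` and its argument names are illustrative assumptions rather than part of the MF model specification, and prices are assumed to be strictly positive.

```python
import math

def relative_price(order_price, best_bid, best_ask, is_buy):
    """Relative price x(t) of Eq. (1), measured from the same-side best price.

    Hypothetical helper: best_bid and best_ask are the quotes at time t-1.
    """
    if is_buy:
        return math.log(order_price) - math.log(best_bid)
    return math.log(best_ask) - math.log(order_price)
```

With this sign convention, a positive value corresponds to a more aggressive order on either side: a buy order priced above the best bid, or a sell order priced below the best ask.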

With these findings in hand, our simulations of the MF model
can be described as follows. Before the evolution of prices, we
generate an array of relative prices $\{x(t) : t = 1, 2, \cdots, T\}$,
drawn from the Student distribution with $\alpha_x = 1.3$ and
$\sigma_x = 0.0024$, and an array of signs $\{s(t) : t = 1, 2, \cdots, T\}$
according to a fractional Brownian motion with $H_s = 0.75$. At each
simulation step $t$, an order is generated, whose relative price
and direction are $x(t)$ and $s(t)$, respectively. If $x(t)$ is not less
than the spread, the order is an effective market order, resulting
in an immediate execution against a limit order waiting at the
opposite best price. Otherwise, the incoming order is an effective
limit order, which is stored in the queue of the limit order book.
Then we scan the standing orders to check if any of them can be
canceled, following exactly the same process as in the MF model.
We simulate $T = 2 \times 10^5$ steps in each round. The stock prices
are recorded and we analyze the last $4 \times 10^4$ returns in each
round.
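
The order-placement step can be sketched as follows. This is only an illustration under stated assumptions: the name `place_order` is hypothetical, the sign is encoded as +1 for buy and -1 for sell, and the spread is taken to be the logarithmic spread so that it is commensurate with $x(t)$. The order book itself and the cancelation rule are not modeled here.

```python
import math

def place_order(x, sign, best_bid, best_ask):
    """Turn a relative price x and a sign (+1 buy, -1 sell) into an order.

    Returns (price, is_market). Using the log spread as the threshold is an
    assumption of this sketch.
    """
    log_spread = math.log(best_ask) - math.log(best_bid)
    if sign > 0:
        price = best_bid * math.exp(x)    # invert Eq. (1) for buy orders
    else:
        price = best_ask * math.exp(-x)   # invert Eq. (1) for sell orders
    # The order crosses the book when x is not less than the spread.
    return price, x >= log_spread
```

For example, with a best bid of 10.0 and a best ask of 10.1, a buy order with $x = 0.02$ is priced above the best ask and is therefore treated as an effective market order, while $x = 0$ joins the queue at the best bid.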

The distribution of returns in the MF model has been studied
in detail and we reproduced the cubic law [48]. We now
perform a detrended fluctuation analysis (DFA) [49, 50] on the
return $r$ and the volatility $v = |r|$ to estimate the DFA scaling
exponents. The results are shown in Fig. 1. Excellent power-law
dependence of the detrended fluctuation function $F(\ell)$ on
the timescale $\ell$ is observed for the two quantities
in the scaling range $8 \le \ell < 7000$. The DFA scaling
exponents are $H_r = 0.55$ for the returns and $H_v = 0.58$ for
the volatility. These exponents are only slightly
greater than 0.5, which means that there is no long memory, or only
very weak memory, in the returns and the volatility. To obtain
a solid picture, we repeated the simulations of the MF model
20 times and performed DFA on the returns and the volatility.
We find that $H_r$ varies in the range [0.54, 0.58] with the
average $H_r = 0.57 \pm 0.01$ for the returns, and $H_v$ varies in
the range [0.56, 0.62] with the average $H_v = 0.59 \pm 0.01$ for
the volatility. This analysis confirms the results of Mike and
Farmer [9]. It is well accepted in mainstream finance that there
is no memory in returns [51], consistent with the weak-form
market efficiency hypothesis, while the volatility possesses strong
persistence with a DFA scaling exponent much greater than
0.5 [52]. Therefore, the MF model captures the stylized fact
that $H_r$ of returns is close to 0.5, but fails to reproduce the strong
memory effect in the volatility. Evidently, an important
feature is missing in the original MF model, which calls for
further scrutiny of the real stock data and a modification of the
model.
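
For readers who wish to reproduce this kind of estimate, a minimal DFA sketch is given below. It assumes non-overlapping boxes and linear detrending by default; the text does not specify which DFA variant was used, so the function `dfa_exponent` and its parameters are illustrative assumptions only.

```python
import numpy as np

def dfa_exponent(series, scales, order=1):
    """Estimate the DFA scaling exponent H from F(scale) ~ scale**H."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())          # integrated, mean-removed series
    fluct = []
    for scale in scales:
        n_boxes = len(profile) // scale
        # One column per non-overlapping box of length `scale`.
        boxes = profile[: n_boxes * scale].reshape(n_boxes, scale).T
        t = np.arange(scale, dtype=float)
        coeffs = np.polyfit(t, boxes, order)   # polynomial trend in every box
        trend = np.vander(t, order + 1) @ coeffs
        fluct.append(np.sqrt(np.mean((boxes - trend) ** 2)))
    # Slope of log F against log scale is the scaling exponent.
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope
```

Applied to uncorrelated noise this estimator should return a value near 0.5, which is the benchmark against which $H_r$ and $H_v$ are compared above.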

In financial markets, it is impossible for a trader to collect
and digest all information that is publicly available, and it is
costly to collect and process diverse information from different
sources. Given the limited processing power of human brains
and a finite amount of money, it is not irrational for traders to
mimic the trading behaviors of others, which may lead to positive
feedbacks and herding behaviors in an intermittent fashion.
In other words, most traders in financial markets play a majority
game. They are more willing to buy when the price rises and to
sell when the price falls. This scenario is known as the information
cascading mechanism [53] and it is well documented that
imitation and herding cause the emergence of volatility clustering
and long memory. A comprehensive taxonomy of herd behavior
was synthesized by Hirshleifer and Teoh [54]. We also
refer to an excellent book by Lyons for a modern treatment [55].
Following this line, a trader is very likely to submit an order
that is "similar" to the preceding limit orders. In addition, the
long memory in the order flow is well known as the "diagonal
effect" [56]. Other than herding, there are at least two alternative






Emergence of long memory in stock volatility from a modified Mike-Farmer model 




Fig. 1: Detrended fluctuation function $F(\ell)$ as a function of time lag
$\ell$ for the returns and the volatility, respectively. The solid lines are the
linear least-squares fits to the data, giving $H_r = 0.55 \pm 0.01$ for returns
and $H_v = 0.58 \pm 0.01$ for volatility. The plot for volatility has been
shifted vertically for clarity.



hypotheses for the origin of long memory in the order flow:
order splitting and traders reacting similarly to the same signal [56].
Since an order is fully determined by its direction
(order sign), aggressiveness (order price) and size, we expect
that these variables might also have strong memory. In the MF
model, the directions of incoming orders are modeled by fractional
Brownian motions with $H_s > 0.5$, while the order size
is fixed. It is thus worthwhile to check whether the order aggressiveness
characterized by relative prices has long memory using
real ultrahigh-frequency stock data, and whether the long memory in
the order aggressiveness, if any, can cause the emergence of
long memory in the volatility.




Fig. 2: Dependence of the detrended fluctuation function $F(\ell)$ on
the timescale $\ell$ for four stocks, whose stock codes are 000012,
000089, 000406 and 000488. The solid lines are the linear least-squares
fits to the data, giving $H_x = 0.77 \pm 0.01$, $0.76 \pm 0.01$, $0.77 \pm 0.01$,
and $0.72 \pm 0.01$, respectively. The plots for stocks 000089, 000406
and 000488 have been shifted vertically for clarity.

In order to study the memory effect of order aggressiveness,
we utilize a database of 23 liquid stocks listed on the
Shenzhen Stock Exchange covering the whole year 2003 [57]. The
database contains detailed information on the incoming order
flow, such as order direction and size, limit price, time, best bid,
best ask, transaction volume, and so on. We focus on the relative
prices of orders submitted during the continuous double
auction. Figure 2 illustrates the dependence of the detrended
fluctuation functions $F(\ell)$ on the timescale $\ell$ for
four randomly chosen stocks. Sound power-law scaling relations
are observed in scaling ranges spanning four orders of
magnitude. The DFA scaling exponents of the relative prices
for the four stocks are estimated to be $H_x = 0.77 \pm 0.01$ in
the scaling range $10 \le \ell < 10^5$, $0.76 \pm 0.01$ in the scaling
range $10 \le \ell < 7 \times 10^4$, $0.77 \pm 0.01$ in the scaling
range $10 \le \ell < 10^5$, and $0.72 \pm 0.01$ in the scaling range
$10 \le \ell < 5 \times 10^4$, respectively. The DFA results for the other
stocks are quite similar. We find that $H_x$ varies in the range
[0.72, 0.87] with an average $H_x = 0.78 \pm 0.03$. It is evident
that the relative price $x$ is super-diffusive and possesses long-term
dependence.

It is worth noting that the long-memory temporal
structure in the relative prices was also observed on the London
Stock Exchange. Zovko and Farmer studied the autocorrelation
function of relative prices for buy orders and sell orders of 50
stocks traded on the London Stock Exchange [58]. They found
that the autocorrelation function decays as a power law with
exponent $\gamma = 0.41 \pm 0.07$. It follows immediately that
$H_x = 1 - \gamma/2 = 0.80 \pm 0.04$ [59]. We also performed a detrended
fluctuation analysis of the relative prices separately for buy orders and sell
orders of the four stocks analyzed in Fig. 2. The exponents are
$0.75 \pm 0.01$, $0.81 \pm 0.01$, $0.77 \pm 0.01$ and $0.70 \pm 0.01$ for buy
orders and $0.77 \pm 0.01$, $0.75 \pm 0.01$, $0.75 \pm 0.01$ and $0.71 \pm 0.01$
for sell orders. There is no significant difference in the memory
properties if one considers relative prices of orders on the same
side of the book.
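
The conversion from the autocorrelation exponent to the DFA exponent can be checked by hand. Assuming simple first-order error propagation (an assumption on our part; only the final value is quoted above), the numbers give:

```latex
H_x = 1 - \frac{\gamma}{2} = 1 - \frac{0.41}{2} = 0.795 \approx 0.80,
\qquad
\delta H_x = \frac{\delta\gamma}{2} = \frac{0.07}{2} = 0.035 \approx 0.04 .
```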

Based on the above empirical finding that the relative prices
have long memory, we can introduce a new ingredient in the
MF model. The modified MF model inherits all the ingredients
of the MF model except that the relative prices are generated
from a Student distribution with long memory. This can
be done as follows. We generate an array of relative prices
$\{x_0(t) : t = 1, 2, \cdots, T\}$ from a Student distribution. Then
we simulate a fractional Brownian motion with $H_x = 0.8$ and
record its differences as $\{y(t) : t = 1, 2, \cdots, T\}$. The
sequence $\{x_0(t) : t = 1, 2, \cdots, T\}$ is rearranged such that the
rearranged series $\{x(t) : t = 1, 2, \cdots, T\}$ has the same rank
ordering as $\{y(t) : t = 1, 2, \cdots, T\}$, that is, $x(t)$ ranks $n$
in the sequence $\{x(t) : t = 1, 2, \cdots, T\}$ if and only if $y(t)$ ranks $n$
in the sequence $\{y(t) : t = 1, 2, \cdots, T\}$ [60,61]. By construction,
$x(t)$ still obeys the same Student distribution. A detrended
fluctuation analysis of $x(t)$ shows that its DFA scaling exponent
is very close to $H_x = 0.8$. This sequence of $x(t)$ is used
as the relative prices in our modified MF model.
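
The rank-reordering step described above can be sketched as follows. Two assumptions are made for illustration: the increments of the fractional Brownian motion are generated exactly by a Cholesky factorization of their covariance matrix (the text does not say which simulation method was used, and this one is practical only for moderate lengths), and the Student variates are obtained by multiplying a standard Student-t draw by the scaling parameter. The function names are hypothetical.

```python
import numpy as np

def fgn_cholesky(n, hurst, rng):
    """Fractional Gaussian noise (increments of fBm) via Cholesky factorization."""
    k = np.arange(n)
    # Autocovariance of unit-variance fractional Gaussian noise at lag k.
    acov = 0.5 * ((k + 1.0) ** (2 * hurst) - 2.0 * k ** (2 * hurst)
                  + np.abs(k - 1.0) ** (2 * hurst))
    cov = acov[np.abs(k[:, None] - k[None, :])]   # Toeplitz covariance matrix
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

def impose_rank_order(x0, y):
    """Rearrange x0 so that it has the same rank ordering as y."""
    x = np.empty(len(x0), dtype=float)
    # The position holding the n-th smallest y receives the n-th smallest x0.
    x[np.argsort(y)] = np.sort(x0)
    return x

rng = np.random.default_rng(0)
T = 1000                                          # short series for illustration
x0 = 0.0024 * rng.standard_t(1.3, size=T)         # Student-distributed values
y = fgn_cholesky(T, 0.8, rng)                     # long-memory template
x = impose_rank_order(x0, y)                      # same values, new ordering
```

Because only the ordering of the values changes, the marginal distribution of the rearranged series is exactly that of the original draw, while its temporal structure follows the long-memory template.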

Numerical results. - Based on the modified MF model
discussed above, we first generate the relative prices $x$ from
the Student distribution with parameters $\alpha_x = 1.3$ and
$\sigma_x = 0.0024$. Then we add long memory to the time series, using
$H_x \approx 0.8$. In each round, we simulate the modified MF model
for $2 \times 10^5$ steps with the same parameters $H_s = 0.75$, $A = 1.12$
and $B = 0.2$, and record the return time series, whose length is
close to $4 \times 10^4$ after removing the transient period. In Fig. 3, we
illustrate a typical segment of the simulated returns from the
modified MF model, which is compared with the return time
series of a real Chinese stock (code 000012) and the original
MF model. It is evident that the return time series of the modified
MF model exhibits clear clustering resembling the clustering
phenomenon in real data, whereas the simulated returns
from the original MF model do not show a clear clustering feature.
This already indicates qualitatively that the volatility of
the modified MF model has stronger long-term memory than
that of the original MF model.
Fig. 3: Comparison of typical return time series from a real Chinese
stock 000012 (upper panel), the original MF model (middle panel),
and the modified MF model (lower panel).



To quantify the strength of the memory effect in the simulated
volatility, we have performed a detrended fluctuation
analysis. Figure 4 shows the detrended fluctuation function
$F(\ell)$ as a function of the timescale $\ell$ in log-log coordinates.
We find that $F(\ell)$ scales as a power law against $\ell$
with the scaling range spanning about three orders of magnitude.
We obtain $H_v = 0.76 \pm 0.01$ in the scaling range
$8 \le \ell < 4500$, which is in excellent agreement with empirical
results. We also performed a detrended fluctuation analysis
on the returns. The results are also presented in Fig. 4. We find
that $H_r = 0.53 \pm 0.01$ in the scaling range $8 \le \ell < 4500$,
consistent with empirical results. Comparing with Fig. 1, we
conclude that the value of $H_x$ has little impact on $H_r$. We
repeated this process 20 times and the results are very similar.
The exponent $H_v$ varies in the range [0.74, 0.77] with an
average $H_v = 0.76 \pm 0.01$, while $H_r$ ranges in [0.53, 0.55] with an
average $H_r = 0.54 \pm 0.01$.

In order to further inspect the quantitative relation between
$H_x$ and $H_v$, more simulations with different values of $H_x$ have
been performed. For each fixed $H_x$, repeated simulations do
not show much fluctuation in $H_v$. The results are shown in
Table 1. It is found that $H_v$ is not identical to $H_x$. However,
$H_v$ increases with $H_x$. Table 1 also confirms that $H_r$ is close
to 0.5 and independent of $H_x$. The relation between volatility
clustering and relative prices has been detected and investigated
for stocks on the London Stock Exchange [58].

Table 1: Dependence of $H_v$ and $H_r$ on $H_x$. For each value of $H_x$,
ten repeated simulations are conducted. The scaling range is
$8 \le \ell < 4500$. The numbers in parentheses are the standard deviations
in units of 0.01.

$H_x$    0.50      0.60      0.70      0.80      0.90
$H_v$    0.57(1)   0.61(1)   0.67(1)   0.76(1)   0.81(2)
$H_r$    0.55(1)   0.55(1)   0.54(1)   0.54(1)   0.54(1)

Fig. 4: Detrended fluctuation analysis of the returns $r$ and the volatility
$v$ generated according to the modified MF model. The solid lines are
the linear least-squares fits to the data, giving $H_r = 0.53 \pm 0.01$ for the
returns and $H_v = 0.76 \pm 0.01$ for the volatility. The plot for volatility
has been shifted vertically for clarity.



Figure 5 shows the empirical complementary cumulative distribution
$P(>v)$ of the volatility generated according to the
modified MF model. We find that the volatility has a power-law
tail

$$
P(>v) \sim v^{-\beta}, \qquad (2)
$$

where $\beta$ is the tail index. Using the least-squares fitting
method, we obtain $\beta = 2.99 \pm 0.02$, statistically indistinguishable from 3. In
other words, the volatility obeys the well-known cubic law [62],
which is also captured by the original MF model [9,48].
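
A least-squares estimate of the tail index in Eq. (2) can be sketched as below. The choice of fitting only the largest 5% of the values, and the name `tail_index_ls`, are assumptions made for illustration; the fitting range actually used is not stated in the text.

```python
import numpy as np

def tail_index_ls(values, tail_fraction=0.05):
    """Least-squares slope of log P(>v) against log v in the upper tail."""
    v = np.sort(np.asarray(values, dtype=float))
    n = len(v)
    ccdf = 1.0 - np.arange(1, n + 1) / n       # empirical P(>v) at each sorted v
    start = int(n * (1.0 - tail_fraction))
    # Drop the largest point, where the empirical P(>v) is zero.
    tail_v, tail_p = v[start:-1], ccdf[start:-1]
    slope, _ = np.polyfit(np.log(tail_v), np.log(tail_p), 1)
    return -slope
```

A quick sanity check is to apply the function to synthetic Pareto samples with a known exponent before using it on simulated volatilities.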

Additional numerical experiments show that the cancelation
process of the modified MF model is not the only one able to
reproduce the main stylized facts. The modified MF model with
a Poissonian cancelation process gives $H_r = 0.51 \pm 0.01$,
$H_v = 0.81 \pm 0.01$, and $\beta = 3.19 \pm 0.03$.

Besides efficiency, the long memory of the volatility and the
cubic law of the returns, the price dynamics is characterized
by multifractality [52]. We adopted the multifractal detrended
fluctuation analysis [63] to investigate the return and volatility
time series generated from the MF model and the modified MF
model, as well as the real data for comparison. For a given
time series, the $q$-th order detrended fluctuation function $F_q(s)$
scales as a power law

$$
F_q(s) \sim s^{H(q)}, \qquad (3)
$$

and the mass exponent $\tau(q)$ in the standard textbook structure
function formalism is [63]

$$
\tau(q) = qH(q) - 1. \qquad (4)
$$

Note that $H(q = 2)$ is the DFA scaling exponent characterizing
the long memory property of the time series. The mass exponent
$\tau(q)$ of each financial variable is plotted in Fig. 6 as a function
of $q$. When $q = 0$, $\tau(0) = -1$ for each case, as predicted
by Eq. (4). It is evident that all $\tau(q)$ functions are nonlinear
with respect to $q$, which confirms the multifractal nature of return
and volatility in both models and in real data. When $q < 0$,
both models deviate remarkably from real data. When $q \ge 0$,
both models reproduce a $\tau(q)$ function of the return that is
quantitatively similar to that of real data; the $\tau(q)$ function for
the volatility from the MF model deviates from that of the real data, while the
modified MF model captures the multifractality of the real data very well.

Fig. 5: Empirical complementary cumulative distribution $P(>v)$ of
the volatility generated according to the modified MF model in double
logarithmic coordinates. The solid line is the best power-law fit to the
data with the tail index $\beta = 2.99 \pm 0.02$.
Fig. 6: Multifractal detrended fluctuation analysis of the returns $r$ and
the volatility $v$ generated according to the MF model and the modified
MF model with comparison to the multifractal nature of the real data.



We have shown that our modified MF model is able to produce
long memory in the volatility while keeping the cubic
law and nonpersistence in the returns. The last but not least
question is whether the long memory in the relative prices alone can
reproduce the long memory in the volatility when there is no
memory in the order signs. To address this question, we performed
extensive simulations following the MF model but with
$H_s = 0.5$ and $H_x = 0.8$. We find that $H_v = 0.78$,
remaining essentially unchanged when compared with the modified model
in which $H_s = 0.75$ and $H_x = 0.8$. Moreover, the volatility
is also distributed according to the cubic law. In addition, we
have $H_r = 0.42$, indicating that the prices evolve in a weakly
sub-diffusive manner, which is nevertheless not far from the
diffusive regime with $H_r = 0.5$. We note that some stocks do
show a weak sub-diffusion effect [51].

Concluding remarks. - In summary, we have improved
the Mike-Farmer model for order-driven markets by introducing
long memory in the order aggressiveness, which is an important
stylized fact identified using the ultra-high-frequency
data of 23 liquid Chinese stocks traded on the Shenzhen Stock
Exchange in 2003. A detrended fluctuation analysis of the
relative prices $x$ unveils that $H_x = 0.78 \pm 0.03$. The modified
MF model is able to produce long memory in the volatility
with $H_v = 0.79 \pm 0.02$, which is much greater than
$H_v = 0.59 \pm 0.01$ obtained from the original MF model. When
we investigate the temporal correlation of returns, we find that
$H_r = 0.53 \pm 0.01$, indicating that the prices are diffusive. In
addition, the cubic law for the return distribution holds in the
modified MF model. Our modified MF model also enables us to
distinguish the isolated memory effects of order directions ($H_s$)
and aggressiveness ($H_x$) on the correlations in returns ($H_r$) and
the volatility ($H_v$). We find that $H_v$ depends strongly on
$H_x$ and is independent of $H_s$. In contrast, $H_r$ depends strongly on
$H_s$ with little impact from $H_x$. We confirmed that both the MF
model and the modified MF model are able to produce
multifractality in the simulated prices.

The price formation process is fully determined by the dynamics
of order submission and order cancelation. Intuitively,
the order submission process has a more important impact on the
emergence of long memory in the volatility. There are four
factors in the order submission process: the DFA scaling exponent
$H_s$ of order signs, the order size, the distribution $f(x)$ and
the DFA scaling exponent $H_x$ of relative prices. Our simulations
show that the distribution $f(x)$ might have an impact on the
return distribution [48] but not on the long memory in the volatility.
Therefore, we infer that the long memory of order aggressiveness
is a nontrivial main component of volatility clustering.
To be more rigorous, order size may be an alternative component
of volatility clustering. Indeed, order sizes are also long-term
correlated [64-68] and there is a well-established positive
volume-volatility correlation [69]. However, the MF model and
the modified MF model do not include order size as an ingredient.
This issue could be addressed when a more realistic model
is available, which is beyond the scope of the current work.